
Conversation

@rwgk (Collaborator) commented Jan 20, 2026

Closes #697

This PR is motivated by nvbug 5808967 / #1525: there is currently no automatic testing for the cuda_bindings/examples at all, so the QA team can be side-tracked unnecessarily by failures that we could discover automatically in the CI here.

This PR enables running the cuda_bindings/examples in wheel-based test environments, which is both a gain and a simplification (see changes in cuda_bindings/examples/common/common.py).

Non-goal for this PR: structural changes to run the examples in various environments or in different ways.

Note that scripts/run_tests.sh runs the examples by default. This PR makes scripts/run_tests.sh succeed in local wheel-based environments (except for one unrelated failure in cuda_core; please ignore it for the purposes of this PR):

$ grep 'FAILED examples' scripts_run_tests_main_log_2026-01-22+093402.txt
FAILED examples/2_Concepts_and_Techniques/streamOrderedAllocation_test.py::main
FAILED examples/3_CUDA_Features/simpleCudaGraphs_test.py::main - AssertionError
FAILED examples/0_Introduction/clock_nvrtc_test.py::main - AssertionError
FAILED examples/0_Introduction/simpleZeroCopy_test.py::main - AssertionError
FAILED examples/0_Introduction/simpleCubemapTexture_test.py::main - Assertion...
FAILED examples/0_Introduction/systemWideAtomics_test.py::main - AssertionError
FAILED examples/0_Introduction/vectorAddDrv_test.py::main - AssertionError
FAILED examples/0_Introduction/vectorAddMMAP_test.py::main - AssertionError
$ grep 'FAILED examples' scripts_run_tests_pr1517_log_2026-01-22+100143.txt

(no output)


For completeness: I was hoping that our CI would reproduce the Python 3.14t failure reported under nvbug 5808967, but it does not. At least we now know that, which helps in the search for the root cause.

@copy-pr-bot bot (Contributor) commented Jan 20, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@rwgk rwgk self-assigned this Jan 20, 2026
@rwgk (Collaborator, Author) commented Jan 20, 2026

/ok to test

@github-actions

Analysis:
- examples were invoked via `python -m pytest` from within `cuda_bindings`, so the repo checkout was on sys.path and imports resolved to the source tree
- `setuptools_scm` generates `cuda/bindings/_version.py` only in the built wheel, so the source tree lacks this file and `from cuda.bindings._version import __version__` fails during example collection
- running `pytest` via the installed entrypoint avoids CWD precedence and keeps imports coming from the installed wheel, which includes the generated version file

Change:
- switch the Linux and Windows example steps to call the `pytest` entrypoint
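For illustration, a minimal diagnostic sketch of the import-resolution issue described above (hypothetical snippet, not part of this PR; it assumes the usual layout where the `cuda/bindings/` source tree lives under `cuda_bindings/`):

```python
# Hypothetical diagnostic, not part of this PR: save and run it once from within
# cuda_bindings/ (where the checkout shadows site-packages, as `python -m pytest` does)
# and once from a directory outside the checkout (as the installed `pytest` entrypoint
# effectively does) to see which copy of cuda.bindings wins.
import cuda.bindings

# Source tree vs. site-packages tells you which copy was imported.
print(cuda.bindings.__file__)

# The source tree lacks the setuptools_scm-generated _version.py, so this import
# succeeds only when the installed wheel is the one being used.
from cuda.bindings._version import __version__

print(__version__)
```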
@rwgk (Collaborator, Author) commented Jan 20, 2026

/ok to test

@mdboom (Contributor) left a comment


This only gets the example tests running in CI, not via all the other ways tests get run (calling pytest directly, `pixi run test`, etc.). It would be preferable to do this at a higher level -- maybe it's possible to make a symlink from cuda_bindings/tests/ to cuda_bindings/examples -- so that the examples are included even for local development.
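For concreteness, a minimal sketch of that kind of higher-level hookup (hypothetical file `cuda_bindings/tests/test_examples.py`, not something this PR adds, and not necessarily how cuda-core does it): a thin wrapper test that runs the examples through the installed `pytest` entrypoint, so plain `pytest`, `pixi run test`, and similar local invocations pick them up too.

```python
# Hypothetical sketch, not part of this PR: a wrapper under cuda_bindings/tests/
# that folds the examples into every regular test run.
import pathlib
import subprocess

# Assumed layout: this file lives in cuda_bindings/tests/, next to cuda_bindings/examples/.
EXAMPLES_DIR = pathlib.Path(__file__).resolve().parents[1] / "examples"


def test_examples_suite():
    # Invoke the installed `pytest` entrypoint rather than `python -m pytest` so that
    # imports resolve to the installed wheel (see the import-resolution notes above).
    result = subprocess.run(["pytest", str(EXAMPLES_DIR)], check=False)
    assert result.returncode == 0
```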

@rwgk rwgk force-pushed the ci_cuda_bindings_examples branch from e34a864 to 7fa3f76 on January 22, 2026 at 16:41
@rwgk (Collaborator, Author) commented Jan 22, 2026

/ok to test

@leofang (Member) commented Jan 22, 2026

> It would be preferable to do this at a higher level -- maybe it's possible to make a symlink from cuda_bindings/tests/ to cuda_bindings/examples -- so that they will be included even for local development.

Let's follow what I set up in cuda-core. All cuda-core examples are run as part of the regular tests, either locally or in the CI.

@leofang (Member) commented Jan 22, 2026

@rwgk for reproducing the bug, try replacing 3.13 here

- { ARCH: 'amd64', PY_VER: '3.13', CUDA_VER: '13.1.0', LOCAL_CTK: '1', GPU: 'h100', GPU_COUNT: '2', DRIVER: 'latest' }

with 3.14t so that we use free-threading Python.

@rwgk (Collaborator, Author) commented Jan 22, 2026

> Let's follow what I set up in cuda-core. All cuda-core examples are run as part of the regular tests, either locally or in the CI.

CC @rparolin since this is about priorities: This is turning into a bigger project than expected. What should I do?

I agree the cuda-core approach is much better than what we have now, but this PR provides immediate CI coverage without blocking that direction.

  • I started this work expecting it to be quick and to help inform how we deal with the QA failures (nvbug 5808967).
  • Now a very different direction is being requested, even though this PR is small and does not get in the way of that direction.
  • I don't know how long the structural changes will take, especially right now: I have not gotten a complete CI run even for this easy PR, opened two days ago; currently rtxpro6000 jobs sit in the queue for hours.

My preference: merge this PR and file a new issue ("Rework organization of cuda_bindings/examples") to be prioritized properly later.

@rwgk (Collaborator, Author) commented Jan 22, 2026

/ok to test

@rwgk (Collaborator, Author) commented Jan 22, 2026

> for reproducing the bug, try replacing 3.13 here
>
> - { ARCH: 'amd64', PY_VER: '3.13', CUDA_VER: '13.1.0', LOCAL_CTK: '1', GPU: 'h100', GPU_COUNT: '2', DRIVER: 'latest' }
>
> with 3.14t so that we use free-threading Python.

Done: commit dbfa3db

@leofang (Member) commented Jan 22, 2026

Here is our reproducer 🙂 https://github.com/NVIDIA/cuda-python/actions/runs/21267230884/job/61209868639?pr=1517#step:27:266

@rwgk (Collaborator, Author) commented Jan 22, 2026

> Here is our reproducer 🙂 https://github.com/NVIDIA/cuda-python/actions/runs/21267230884/job/61209868639?pr=1517#step:27:266

Awesome! I'll work on taking care of that now, to get the nvbug out of limbo.

rwgk added 2 commits January 22, 2026 16:06
Keep pointer arrays alive through launches to avoid free-threaded Python
misaligned-address failures caused by temporary argument buffers.
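For context, a hedged illustration of the keep-alive pattern that commit message describes (illustrative names only; the actual example changes in this PR may look different):

```python
# Hypothetical illustration, not the actual diff: bind the packed kernel-argument
# buffers to locals that stay in scope through the launch, instead of building them
# as temporaries inside the launch call, where free-threaded Python may reclaim them
# early and trigger misaligned-address failures.
import ctypes


def launch_with_live_args(launch_fn, kernel, stream, d_ptr, num_elements):
    arg_ptr = ctypes.c_void_p(int(d_ptr))     # kept alive by this local
    arg_n = ctypes.c_size_t(num_elements)     # kept alive by this local
    # Array of pointers to the argument values, also bound to a local.
    arg_array = (ctypes.c_void_p * 2)(
        ctypes.addressof(arg_ptr),
        ctypes.addressof(arg_n),
    )
    launch_fn(kernel, arg_array, stream)
    # arg_ptr, arg_n, and arg_array stay referenced until we return,
    # i.e. through the launch.
```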
@rwgk (Collaborator, Author) commented Jan 23, 2026

/ok to test

@rwgk (Collaborator, Author) commented Jan 23, 2026

Wow, so many unrelated flakes! Five in total; I looked at all of them. The one we're most interested in just says "This job failed" with a big X; it didn't even start up.

Currently no job is running, only 34 are queued. I'll cancel and rerun. Nothing is lost, but hopefully we'll get luckier with the infrastructure.
